Abstract: In this paper we investigate the problem of optimal
MDS-encoded cache placement at the wireless edge to minimize
the backhaul rate in heterogeneous networks. We derive the
backhaul rate performance of any caching scheme based on
file splitting and MDS encoding and we formulate the optimal 
caching scheme as a convex optimization problem. We then 
thoroughly investigate the performance of this optimal scheme 
for an important heterogeneous network scenario. We compare 
it to several other caching strategies and we analyze the influence 
of the system parameters, such as the popularity and size of the 
library files and the capabilities of the small-cell base stations,
on the overall performance of our optimal caching strategy. Our 
results show that the careful placement of MDS-encoded content 
in caches at the wireless edge leads to a significant decrease of 
the load of the network backhaul and hence to a considerable 
performance enhancement of the network. 

I. Introduction 

Caching content at the wireless edge is a promising technique
for future 5G wireless networks [1]. One of the main
motivations behind the idea of edge caching comes from the
possibility of significantly reducing the backhaul usage and
thus the latency in content retrieval by bringing the content
closer to the end users. Exploiting the new capabilities of
future multi-tier networks [2], numerous works have recently
investigated the potential benefits of caching data in densely
deployed small-cell base stations (SBS) equipped with storage
capabilities [3]. In [4] it was shown that caching at the edge
can provide significant gains in terms of energy efficiency,
which is considered a fundamental metric for future wireless
networks.

The investigation of caching from an information-theoretic
point of view is presented in [5], where caching metrics are
defined and analyzed for large networks. In [6] the problem
is studied in terms of outage probability. Moreover, other
embodiments and advantages of edge caching have been
discussed in the literature. In [7], the idea of using the mobility
of users in the network to increase caching gains is proposed,
while in [8] the authors leverage the possibility of exploiting
the storage capabilities of mobile users by caching content
directly on the users’ devices.

In order to improve the theoretical performance limits of
caching, a new interesting perspective stems from using ideas
from coding theory. In [9], network coding (NC) is exploited
in a scenario where each user is equipped with a cache. For
this scenario, the authors show that a network coding based
caching scheme decreases the backhaul load in comparison
to the usual uncoded schemes. However, the extension of NC
to a more complex scenario, where each user can be served
by multiple SBSs, has been proved to be NP-complete [10].
Moreover, the use of NC in a scenario where the information
is sent via unicast transmission does not give any advantage.
Another promising related direction is the use of maximum-distance
separable (MDS) codes to virtually increase the
storage size in case of overlapping coverage areas. In [11],
content is conveyed to distributed helpers using MDS codes
to remedy the limitations of uncoded delivery in terms of
delay. Assuming the network topology is known, the content
placement strategy is formulated as a convex minimization of
the overall delay. A random caching scheme for a cooperative
multiple-input multiple-output (MIMO) network is combined
with MDS encoding of the cached content to increase the gains
resulting from MIMO cooperation in [12].

Departing from those previous works, in this paper we investigate
the performance of MDS-encoded distributed caching at
the SBSs. In particular, we consider a scenario where mobile
users can be served by multiple SBSs. In this setup, we derive
the joint optimal encoding and placement of the encoded
packets at the SBSs in order to minimize the backhaul rate.
Our main contributions are:

• We formally define the problem of caching MDS-encoded 
content at the wireless edge for a heterogeneous network 
(HetNet) scenario. 

• We derive the backhaul rate performance of a caching
scheme based on storing MDS encoded packets, formulating
the optimal caching scheme as a convex optimization
problem.

• We investigate the performance of the proposed packet
placement and delivery scheme for a relevant HetNet
scenario. Moreover, we compare it to several caching
schemes, analyzing the benefits of MDS encoding on the
performance of the scheme, as well as the importance of
carefully optimizing the placement of the coded packets.

• Our results show that the use of MDS codes in the
placement of content at the wireless edge yields
significant offloading of the network backhaul.

This paper is organized as follows. In Section II we define our
system model, caching scheme and performance measures. In
Section III we derive the main theoretical results of the paper.
In Section IV we thoroughly investigate the performance of
our optimal scheme and compare it to other caching
schemes in a heterogeneous network scenario. Finally, Section
V concludes this paper.
Fig. 1. Heterogeneous network.

II. System Model 

In this section we describe our system model and we 
formally define caching schemes for heterogeneous networks. 

A. Network Model 

We consider the network illustrated in Fig. 1, where $U$
wireless users send requests to download files from a macro-cell
base station. The macro-cell base station (MBS) has access
to a library of $N$ files $\mathcal{F} = \{F_1, \dots, F_N\}$, each of size $B$
bits. Note that the assumption of equal-size files is justifiable
in practice since files can be divided into blocks of the same
size. We suppose that the users request entire files stored in the
library, i.e., if $k_u$ is the data request of user $u$, then $k_u \in \mathcal{F}$.
Each file has a different probability of being requested by the
users. In particular, the probability distribution vector of the
files is denoted by $\mathbf{p} = [p_1, \dots, p_N]$, referred to as the file
popularity, where file $F_j$ is requested with probability $p_j$ and

$$\sum_{j=1}^{N} p_j = 1.$$



Moreover, $N_{\mathrm{SBS}}$ small-cell base stations are deployed in
order to serve user requests within short distance. We assume
that each SBS has a cache of size $M \cdot B$ bits (i.e., it can
store up to $M \le N$ complete files), and can send data to the
users through an error-free link, i.e., we assume that content is
delivered without errors as long as a user is within the coverage
range. We denote by $r$ the coverage range of the SBSs. Each
user requesting files in $\mathcal{F}$ is initially served by the SBSs it
can contact. If the requested file is not completely present in
the caches, the MBS has to send the missing data to the user,
using the backhaul connection. The topology of the overall
network may vary over time; in particular, at each instant
$t$, the connection network can be described as illustrated in
Fig. 2.



Fig. 2. Instantaneous Heterogeneous network topology. 


Each user $u$ can be served by $d_u$ SBSs
depending on its location in the area (see Fig. 2). In particular,
we denote by $\gamma_i$ the probability that a user is served by $d_u = i$
SBSs. This probability depends on the SBS deployment and
on the density of the users; in Section IV we will show how
to evaluate this probability in any scenario.

B. Caching Schemes 

A caching scheme consists of two phases, namely the
placement phase and the delivery phase.

1) Placement phase: In this phase, the caches of the SBSs
are filled according to the placement strategy. This phase
typically occurs at a moment with a low amount of file
requests, e.g., at night.

2) Delivery phase: In this phase, the users send requests
to the MBS. These requests are initially served by the
SBSs covering their locations. If a user cannot collect
enough information to recover the file, the MBS has to
deliver the missing information through the backhaul.

Before the placement phase, each file in the library is split into
$n$ fragments, i.e., $F_j = \{f_1^{(j)}, \dots, f_n^{(j)}\}$ for all $1 \le j \le N$. In
a coded caching scheme, these fragments are used to create $E_j$
encoded packets $\{e_1^{(j)}, \dots, e_{E_j}^{(j)}\}$. We want to highlight that a
fragment is an uncoded part of a file, while a packet is an
encoded part of a file. In the following, a part of a file can
denote either a fragment or an encoded packet.

In the placement phase, each SBS receives $m_j$ parts of
file $F_j$, where $\mathbf{m} = [m_1 \cdots m_N]$ is referred to as the cache
placement. The placement problem consists of finding the
optimal number of parts $m_j$ of each file to be stored in the
caches in order to minimize the average backhaul rate, which
is defined as the average fraction of files that needs to be
downloaded from the MBS during the delivery phase.



























In the following, we study the performance of the proposed 
MDS coded scheme in terms of average use of the backhaul 
link, and we compare the results with the uncoded scheme. 

B. Performance Analysis 

To begin with, we calculate the average backhaul rate of 
the proposed MDS coded caching scheme. 

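Proposition 1. The average backhaul rate for an encoded
caching placement scheme defined by the placement
$\mathbf{m} = [m_1 \cdots m_N]$ can be calculated as

$$R(\mathcal{C}^{\mathrm{MDS}}_{\mathbf{m}}) = \sum_{d=1}^{S} \sum_{j=1}^{N} \gamma_d\, p_j \left( 1 - \min\left( 1, \frac{d\, m_j}{n} \right) \right) \tag{1}$$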


III. Performance Analysis of Caching Schemes 

In this paper, we propose to store MDS coded packets at the
SBSs instead of fragments or entire files. We initially present
in detail our scheme, and calculate the average backhaul rate
of an MDS coded caching scheme. Afterwards, we prove that,
given a placement scheme $\mathbf{m} = [m_1 \cdots m_N]$, storing encoded
packets always provides an advantage over storing fragments
in terms of minimization of the backhaul rate. Finally, we
formulate the MDS coded caching as an optimization problem,
and show that finding an optimal solution for the problem is
tractable.

A. MDS Coded Caching Scheme 

Maximum-distance separable codes are codes achieving the
Singleton bound. In practice, using an MDS($a$, $b$) code it is
possible to create $a$ encoded packets from $b$ input fragments,
such that any subset of $b$ encoded packets is sufficient to
recover the data. The best-known examples of MDS codes
are the Reed-Solomon (RS) codes [13]. Lately, sub-optimal
MDS codes were proposed, like fountain codes [14], where
$b(1 + \epsilon)$ encoded packets are needed for the recovery.
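The MDS property can be illustrated with a minimal sketch, assuming a toy Reed-Solomon-style construction over the prime field GF(257). The field size, the function names and the polynomial-evaluation encoder are illustrative assumptions and not the code used in this paper: the $b$ fragments are taken as coefficients of a polynomial of degree below $b$, the $a$ packets are its values at $a$ distinct points, and any $b$ packets recover the fragments by Lagrange interpolation.

```python
P = 257  # small prime field, chosen only for illustration


def mds_encode(fragments, a):
    """Create a packets (x, y) by evaluating the fragment polynomial at x = 1..a."""
    assert len(fragments) <= a < P
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(fragments)) % P)
            for x in range(1, a + 1)]


def _mul_linear(poly, root):
    """Multiply poly (coefficients, lowest degree first) by (x - root) mod P."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - root * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out


def mds_decode(packets, b):
    """Recover the b fragments from any b packets via Lagrange interpolation."""
    pts = packets[:b]
    coeffs = [0] * b
    for i, (xi, yi) in enumerate(pts):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(pts):
            if j != i:
                basis = _mul_linear(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P  # modular inverse (Fermat)
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % P
    return coeffs


fragments = [10, 200, 30]          # b = 3 fragments, each a symbol in GF(257)
packets = mds_encode(fragments, 6)  # a = 6 encoded packets
recovered = mds_decode([packets[0], packets[2], packets[5]], 3)
```

Any other choice of three packets would serve equally well, which is what allows a user to combine packets from whichever caches it can reach.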

In the proposed scheme, each SBS receives $m_j$ different
MDS coded packets for each file $F_j$ to be stored in its cache,
with $\mathbf{m} = [m_1 \cdots m_N]$. Moreover, the MBS stores $n - m_j$
encoded packets for each file $F_j$ in order to serve the requests.
Formally, we propose the following:

1) Placement phase: The MBS creates $E_j = n + (N_{\mathrm{SBS}} - 1) m_j$
encoded packets using an MDS($E_j$, $n$) code. The
MBS keeps $n - m_j$ encoded packets, and sends the other
ones so that each SBS stores $m_j$ unique packets.

2) Delivery phase: A user requesting file $F_j$ contacts $d_u \ge 1$
SBSs, receiving $m_j d_u$ different encoded packets. If
$m_j d_u \ge n$, the user can recover the file due to the
MDS property of the code. Otherwise the MBS sends
the remaining $n - m_j d_u \le n - m_j$ encoded packets.
Since the MBS kept $n - m_j$ encoded packets, the user
does not receive replicated packets, and can decode the
file.
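The packet accounting of the two phases above can be sketched as follows. This is a minimal illustration with hypothetical function names; it only counts packets and does not perform any encoding.

```python
def packets_created(n, m_j, n_sbs):
    """E_j = n + (N_SBS - 1) * m_j: the MBS keeps n - m_j packets and
    every one of the N_SBS caches receives m_j unique packets."""
    return n + (n_sbs - 1) * m_j


def deliver(n, m_j, d_u):
    """Return (packets taken from SBS caches, packets sent by the MBS)
    for a user covered by d_u >= 1 SBSs and requesting file F_j."""
    if d_u < 1 or not 0 <= m_j <= n:
        raise ValueError("expected d_u >= 1 and 0 <= m_j <= n")
    from_caches = min(n, m_j * d_u)  # the user needs only n packets in total
    from_mbs = n - from_caches       # never exceeds the n - m_j packets kept
    return from_caches, from_mbs


# Example: n = 10 fragments, m_j = 3 packets per cache, 5 SBSs in the network.
total = packets_created(10, 3, 5)   # 7 kept at the MBS + 5 * 3 cached
two_sbs = deliver(10, 3, 2)         # 6 packets from caches, 4 over the backhaul
four_sbs = deliver(10, 3, 4)        # caches alone suffice
```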


where, in (1), $S \le N_{\mathrm{SBS}}$ is the maximum number of SBSs serving a
user.

Proof: Let $u$ be a user served by $d_u$ SBSs. If $u$ requests
file $F_j$, exactly $n$ packets have to be collected in order to
retrieve $F_j$. Each SBS stores $m_j$ packets of the file $F_j$,
hence the user can collect $d_u m_j$ different encoded packets.
If $d_u m_j \ge n$ the MBS does not send any packet, otherwise
$n - d_u m_j$ packets are sent through the backhaul. The rate for
user $u$ requesting file $F_j$ can consequently be calculated
as

$$R_u(F_j) = 1 - \min\left(1, \frac{d_u m_j}{n}\right). \tag{2}$$

A user has probability $\gamma_d$ of being served by $d$ SBSs, hence
the expected rate for the file $F_j$ is given by

$$R(F_j) = \sum_{d=1}^{S} \gamma_d \left(1 - \min\left(1, \frac{d\, m_j}{n}\right)\right).$$

Finally, each file has a different probability of being requested
by a user. Averaging over the request probability distribution
$\mathbf{p} = [p_1 \cdots p_N]$, the average backhaul rate for the encoded
caching placement scheme is given by

$$R(\mathcal{C}^{\mathrm{MDS}}_{\mathbf{m}}) = \sum_{d=1}^{S} \sum_{j=1}^{N} \gamma_d\, p_j \left(1 - \min\left(1, \frac{d\, m_j}{n}\right)\right).$$
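As a minimal sketch, the rate of Eq. (1) can be evaluated directly from its double sum. The function name and the list-based arguments are illustrative choices: `gamma[d-1]` holds the probability of being served by `d` SBSs, `p[j]` the popularity of file `j`, and `m[j]` the number of packets of that file stored in each cache.

```python
def backhaul_rate_mds(gamma, p, m, n):
    """Average backhaul rate of the MDS coded scheme, following Eq. (1)."""
    total = 0.0
    for d, gamma_d in enumerate(gamma, start=1):
        for p_j, m_j in zip(p, m):
            # Fraction of the file that still has to come from the MBS.
            total += gamma_d * p_j * (1.0 - min(1.0, d * m_j / n))
    return total


# Two files split into n = 10 fragments, half of each file cached per SBS,
# users equally likely to be covered by one or two SBSs.
rate = backhaul_rate_mds(gamma=[0.5, 0.5], p=[0.6, 0.4], m=[5, 5], n=10)
```

In this example users covered by two SBSs collect all ten packets from the caches, so only the users covered by a single SBS load the backhaul.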


In order to evaluate the gain given by the use of MDS codes,
we calculate the backhaul rate of a random caching scheme,
where only file fragmentation is exploited. In practice, given a
placement $\mathbf{m} = [m_1 \cdots m_N]$, the MBS sends to each SBS $m_j$
different fragments randomly drawn among the $n$ fragments
of file $F_j$.

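Proposition 2. The average backhaul rate for a random
caching placement scheme $\mathcal{C}_{\mathbf{m}}$ defined by the placement
$\mathbf{m} = [m_1 \cdots m_N]$ can be calculated as

$$R(\mathcal{C}_{\mathbf{m}}) = \sum_{d=1}^{S} \sum_{j=1}^{N} \gamma_d\, p_j \left(1 - \frac{m_j}{n}\right)^{d}. \tag{3}$$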

Proof: The proof is similar to the one proposed for
Proposition 1. The main difference is the calculation of the
rate for a user requesting a file. For the given scheme, if
a user $u$ requesting file $F_j$ is served by $d_u$ SBSs, the average
backhaul rate is

$$R_u(F_j) = \left(1 - \frac{m_j}{n}\right)^{d_u}. \tag{4}$$

This is due to the fact that in this case the fragments are
not unique since they can be replicated in different caches.
Therefore the probability that a single fragment is not present
in any of the caches is $(1 - m_j/n)^{d_u}$. The rest of the proof
is similar to the one of Proposition 1. ■
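The per-user rate of the random uncoded scheme can be checked numerically. The sketch below, with hypothetical function names and a fixed seed, evaluates the closed form of Proposition 2 and compares the term $(1 - m_j/n)^{d}$ with a Monte Carlo simulation in which each of $d$ caches independently draws $m_j$ distinct fragments out of $n$.

```python
import random


def backhaul_rate_random(gamma, p, m, n):
    """Average backhaul rate of the random uncoded scheme (Proposition 2)."""
    return sum(g * p_j * (1.0 - m_j / n) ** d
               for d, g in enumerate(gamma, start=1)
               for p_j, m_j in zip(p, m))


def simulate_missing_fraction(n, m_j, d, trials=5000, seed=0):
    """Empirical fraction of fragments absent from all d reachable caches."""
    rng = random.Random(seed)
    missing = 0
    for _ in range(trials):
        seen = set()
        for _ in range(d):
            seen.update(rng.sample(range(n), m_j))  # one cache's fragments
        missing += n - len(seen)
    return missing / (trials * n)


closed_form = (1.0 - 3 / 10) ** 2            # n = 10, m_j = 3, d = 2
empirical = simulate_missing_fraction(10, 3, 2)
```

The two values agree up to sampling noise, reflecting that duplicated fragments across caches are wasted in the uncoded scheme.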

Now we can prove that, given a cache placement $\mathbf{m} =
[m_1 \cdots m_N]$, it is preferable in terms of backhaul usage to
store MDS coded packets rather than fragments.

Proposition 3. For any placement $\mathbf{m} = [m_1 \cdots m_N]$, it holds
that

$$R(\mathcal{C}^{\mathrm{MDS}}_{\mathbf{m}}) \le R(\mathcal{C}_{\mathbf{m}}). \tag{5}$$

Proof: As stated in the proof of Proposition 2, the
difference between Equations (1) and (3) is given by (2) and
(4), respectively. In order to prove the proposition, we have to
prove that (4) is larger than or equal to (2).

Since $d \ge 1$ and $0 \le m_j/n \le 1$, from Bernoulli’s inequality


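We recall that the number of fragments $n$ can be chosen, hence
we can reformulate the optimization problem as

$$\begin{aligned}
\min_{q_1, \dots, q_N} \quad & \sum_{d=1}^{S} \sum_{j=1}^{N} \gamma_d\, p_j \left(1 - \min(1, d\, q_j)\right) \\
\text{s.t.} \quad & \sum_{j=1}^{N} q_j = M, \\
& 0 \le q_j \le 1 \quad \forall j \in [1, N],
\end{aligned} \tag{7}$$

where $q_j = m_j/n$. Each term $1 - \min(1, d\, q_j) = \max(0, 1 - d\, q_j)$
is the maximum of two affine functions of $q_j$ and is therefore convex,
while the constraints are linear. Hence (7) is a convex
optimization problem, which concludes the proof. ■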
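One possible way to solve (7) without a general-purpose solver is sketched below. It is an illustrative approach and not necessarily the method used to produce the results of this paper: since the objective is separable, convex and piecewise linear in each $q_j$ with breakpoints at $1/d$, the cache budget $M$ can be spent greedily on the segments with the largest rate reduction per unit of cache. The function names are hypothetical.

```python
def mds_rate(gamma, p, q):
    """Objective of (7): average backhaul rate with q[j] = m_j / n."""
    return sum(g * p_j * (1.0 - min(1.0, d * q_j))
               for d, g in enumerate(gamma, start=1)
               for p_j, q_j in zip(p, q))


def optimal_placement(gamma, p, M):
    """Greedy solution of (7) for a cache budget 0 <= M <= len(p)."""
    S = len(gamma)
    segments = []  # (rate reduction per unit of q_j, file index, segment length)
    for j, p_j in enumerate(p):
        lo = 0.0
        for k in range(S, 0, -1):
            hi = 1.0 / k
            # On (lo, hi) users covered by d <= k SBSs still need the MBS,
            # so the objective decreases with slope p_j * sum_{d<=k} d*gamma_d.
            gain = p_j * sum(d * gamma[d - 1] for d in range(1, k + 1))
            segments.append((gain, j, hi - lo))
            lo = hi
    segments.sort(key=lambda s: -s[0])  # stable sort keeps per-file order
    q = [0.0] * len(p)
    budget = float(M)
    for gain, j, length in segments:
        if budget <= 0.0:
            break
        take = min(length, budget)
        q[j] += take
        budget -= take
    return q


gamma, p = [0.5, 0.5], [0.6, 0.3, 0.1]
q_opt = optimal_placement(gamma, p, M=1)
```

In this small example the budget is split between the two most popular files, which gives a rate of 0.325, whereas caching the most popular file entirely gives 0.4.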


IV. Numerical Illustrations 

In this section we investigate the performance of the optimal 
MDS coded caching scheme in terms of backhaul rate in 
a HetNet topology of particular relevance. We emphasize 
that our numerical results can be further generalized to any 
network topology. In particular the challenging problem of the 
optimization of the locations of the small-cell base stations is 
of considerable practical and theoretical interest, but outside 
the scope of this paper. 


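$$\left(1 - \frac{m_j}{n}\right)^{d} \ge 1 - \frac{d\, m_j}{n},$$

and obviously

$$\left(1 - \frac{m_j}{n}\right)^{d} \ge 0.$$

We conclude the proof by noticing that

$$1 - \min\left(1, \frac{d\, m_j}{n}\right) = \max\left(0,\, 1 - \frac{d\, m_j}{n}\right). \qquad ■$$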
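The inequality behind Proposition 3 can also be checked numerically. The following sketch, with hypothetical function names and an arbitrary grid of values of $q = m_j/n$ and $d$, compares the per-user rate of the uncoded scheme with that of the MDS coded scheme.

```python
def uncoded_miss(q, d):
    """Per-user backhaul rate with randomly placed fragments."""
    return (1.0 - q) ** d


def mds_miss(q, d):
    """Per-user backhaul rate with MDS coded packets."""
    return 1.0 - min(1.0, d * q)


def max_violation(max_d=8, steps=1000):
    """Largest value of mds_miss - uncoded_miss over the grid (<= 0 expected)."""
    return max(mds_miss(k / steps, d) - uncoded_miss(k / steps, d)
               for d in range(1, max_d + 1)
               for k in range(steps + 1))


gap_example = uncoded_miss(0.5, 2) - mds_miss(0.5, 2)  # d = 2, half a file cached
```

For $d = 1$ the two schemes coincide, so the gain of MDS encoding comes entirely from users located in overlapping coverage areas.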



C. Optimal MDS Coded Caching 

In the following, we study the placement problem of MDS
encoded packets as an optimization problem, and show its
tractability. Based on the average backhaul rate of Proposition
1, we have the following result.

Proposition 4. Finding the optimal MDS coded placement
scheme defined by $\mathbf{m}^{(\mathrm{opt})} = [m_1^{(\mathrm{opt})} \cdots m_N^{(\mathrm{opt})}]$, which
minimizes the average backhaul rate $R(\mathcal{C}^{\mathrm{MDS}}_{\mathbf{m}})$, is a convex
optimization problem.


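Proof: The placement problem can be recast as the
following optimization problem:

$$\begin{aligned}
\min_{m_1, \dots, m_N} \quad & \sum_{d=1}^{S} \sum_{j=1}^{N} \gamma_d\, p_j \left(1 - \min\left(1, \frac{d\, m_j}{n}\right)\right) \\
\text{s.t.} \quad & \sum_{j=1}^{N} m_j = M n, \\
& 0 \le m_j \le n.
\end{aligned}$$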


A. Network Topology 

In the following we consider the network depicted in Fig. 1,
where the SBSs are deployed according to a regular grid with
a distance $d = 60$ meters between neighboring SBSs. The macro-cell
base station coverage area $\mathcal{R}$ has a radius of $D = 500$ meters.
In order to reach any user in $\mathcal{R}$, each SBS has a coverage area
of radius $r$ such that $d/\sqrt{2} \le r \le d$, which means that the
coverage areas are overlapping as shown in the highlighted
square in Fig. 3.

If we denote by $A_d$ and $\rho_d$ the total area of $\mathcal{R}$ where a user
can be served by $d$ SBSs and its average user density, respectively,
the probability $\gamma_d$ that a user is served by $d$ SBSs can be
formally calculated as

$$\gamma_d = \frac{\rho_d A_d}{\sum_{i=1}^{S} \rho_i A_i}.$$

We note that for this particular deployment, the coefficients $A_d$
can be theoretically approximated using simple geometrical
calculations.
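As an alternative to the geometrical calculation, the probabilities $\gamma_d$ can be estimated by sampling. The sketch below is a simplified illustration under stated assumptions: the user density is uniform (so it cancels in the ratio above), the SBSs lie on an infinite square grid, and the boundary of the macro cell is ignored. The function name, sample count and seed are hypothetical.

```python
import math
import random


def estimate_gamma(d_grid, r, samples=20000, seed=0):
    """Estimate {d: gamma_d} for uniform users on a square grid of SBSs.

    By periodicity it suffices to draw users in one grid cell and count
    the SBSs (grid points) within distance r.
    """
    rng = random.Random(seed)
    reach = math.ceil(r / d_grid) + 1
    counts = {}
    for _ in range(samples):
        x, y = rng.random() * d_grid, rng.random() * d_grid
        k = sum(1
                for i in range(-reach, reach + 2)
                for j in range(-reach, reach + 2)
                if (x - i * d_grid) ** 2 + (y - j * d_grid) ** 2 <= r * r)
        counts[k] = counts.get(k, 0) + 1
    return {k: c / samples for k, c in sorted(counts.items())}


gamma_est = estimate_gamma(d_grid=60.0, r=60.0, samples=5000)
mean_sbs = sum(k * g for k, g in gamma_est.items())
```

A quick sanity check is that the mean number of serving SBSs should be close to $\pi r^2 / d^2$, the coverage area of one SBS divided by the area of one grid cell.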

In the following, we consider the users to be uniformly
distributed in $\mathcal{R}$, with density $\rho_d = \rho = 0.05$ users/m$^2$.
These numbers correspond to 316 small-cell base stations
being deployed, covering $U = 31{,}415$ mobile users. The
request probability of the files $\mathbf{p} = [p_1 \cdots p_N]$ is distributed
according to a Zipf law of parameter $\alpha$, i.e.,

$$p_j = \frac{1/j^{\alpha}}{\sum_{k=1}^{N} 1/k^{\alpha}}, \quad \forall j = 1, \dots, N, \tag{6}$$

where $\alpha$ represents the skewness of the distribution and can
take values in $[0.5, 1.5)$, see e.g. [15].
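A minimal sketch of the Zipf popularity profile, assuming the standard normalized form in which file $j$ has weight $1/j^{\alpha}$ (the function name is illustrative):

```python
def zipf_popularity(N, alpha):
    """Return [p_1, ..., p_N] with p_j proportional to 1 / j**alpha."""
    weights = [1.0 / j ** alpha for j in range(1, N + 1)]
    total = sum(weights)
    return [w / total for w in weights]


p = zipf_popularity(200, 0.7)  # library size and skewness used in the figures
```

Larger values of $\alpha$ concentrate the requests on the first few files, while $\alpha = 0$ would give a uniform popularity.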








Fig. 4. Backhaul rate as a function of the cache size $M$, with $N = 200$ files,
$\alpha = 0.7$ and $r = 60$ meters. The lines without markers depict the use of
MDS codes (encoded packets) in the placement, while the lines with markers
depict uncoded (fragments) placement.

B. Caching Strategies 

In order to evaluate the performance of the optimal MDS
coded placement scheme, we compare the optimal achievable
backhaul rate to three other practical placement
schemes, with and without MDS encoding of the files:

1) “most popular” placement $\mathcal{C}_{(\mathrm{pop})}$ and $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$: each SBS
stores the $M$ most popular files. Since entire files are
stored for this scheme, we note that $\mathcal{C}_{(\mathrm{pop})}$ and $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$ are
equivalent. Hence we call $R_{(\mathrm{pop})}$ the rate achieved using
this scheme.

2) uniform placement $\mathcal{C}_{(\mathrm{unif})}$ and $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{unif})}$: the file is divided
into $n$ fragments, which are placed uniformly at random
at each SBS, i.e., each SBS stores a fraction $M/N$ of the
$n$ fragments of each file. In the case that MDS codes are employed,
the SBSs store the same number of encoded packets for each file
instead of fragments. We call $R^{\mathrm{MDS}}_{(\mathrm{unif})}$ and $R_{(\mathrm{unif})}$ the
rates achieved using this scheme with and without MDS
encoding, respectively.

3) proportional placement $\mathcal{C}_{(\mathrm{prop})}$ and $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{prop})}$: for each file
$F_j$, $n$ fragments are created, along with a congruous
number of encoded packets. The amount of parts $m_j$ of
file $F_j$ stored at the SBSs is proportional to its popularity
$p_j$ while satisfying the size constraints. In particular,
the number of parts cached at the SBS is iteratively
calculated as

. ( 

rui = mm n,pi -- 

\ Mn 

We call $R^{\mathrm{MDS}}_{(\mathrm{prop})}$ and $R_{(\mathrm{prop})}$ the rates achieved using this
scheme with and without MDS encoding, respectively.
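The three baseline placements can be sketched as vectors of cached fractions $q_j = m_j/n$. The function names are hypothetical, and the proportional rule in particular is an assumption: it processes the files by decreasing popularity, gives each file a share of the remaining cache proportional to its popularity, and caps the share at one complete file. This is one plausible reading of the iterative rule and may differ from the exact formula used in this paper.

```python
def most_popular(p, M):
    """Cache the int(M) most popular files entirely."""
    order = sorted(range(len(p)), key=lambda j: -p[j])
    q = [0.0] * len(p)
    for j in order[:int(M)]:
        q[j] = 1.0
    return q


def uniform(p, M):
    """Cache the same fraction M / N of every file."""
    return [M / len(p)] * len(p)


def proportional(p, M):
    """Assumed iterative rule: popularity-proportional shares capped at 1."""
    order = sorted(range(len(p)), key=lambda j: -p[j])
    q = [0.0] * len(p)
    budget, mass = float(M), float(sum(p))
    for j in order:
        share = p[j] * budget / mass if mass > 0 else 0.0
        q[j] = min(1.0, share)
        budget -= q[j]   # cache space left for the less popular files
        mass -= p[j]     # popularity mass of the files still to be placed
    return q


p = [0.5, 0.3, 0.2]
q_pop, q_unif, q_prop = most_popular(p, 2), uniform(p, 2), proportional(p, 2)
```

With $M = 2$ the proportional rule caps the most popular file at one full copy and redistributes the remaining cache space over the two other files.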

C. Numerical Results 

In Fig. 4 we depict the backhaul rates as a function of the
SBS cache size $M$. To begin with, it can be noticed that the
use of MDS codes gives an advantage in terms of reduction of
the backhaul usage, since MDS placement always outperforms
the uncoded placement for each of the schemes. This behavior
confirms the result presented in Proposition 3. Consequently,
in the remainder of our analysis we only show the rates of
MDS caching schemes. Secondly, we observe that, as can be
expected, the backhaul rates decrease when the storage
capacity of the SBSs increases and that the optimal caching
scheme outperforms the three other schemes. Furthermore, we
observe that the difference between $R^{\mathrm{MDS}}_{(\mathrm{opt})}$ and the rate of the
other schemes increases as the cache size increases. Finally,
we can notice that the $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$ scheme performs the closest to
the optimal scheme for small values of $M$, getting worse as
$M$ increases. We conjecture that this scheme is not able to fully
take advantage of the nature of future HetNets, i.e., of the
overlap between coverage areas of the SBSs.

Fig. 5. Backhaul rate as a function of the SBS coverage radius $r$, with
$N = 200$ files, $\alpha = 0.7$, and $M = 20$ files.

In Fig. 5 we confirm the previous conjecture by representing
the backhaul rates as a function of the coverage radius $r$ of
the SBSs. We identify in the figure a striking behavior for the
$\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$ placement scheme, as its rate does not depend on the
coverage radius. This behavior is due to the fact that the most
popular entire files are stored at the SBSs for this scheme:
being served by more SBSs does not increase the probability
of having access to new files since the same files are stored in
every cache. In contrast to the $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$ scheme, the performance
of the other schemes improves as the coverage area increases, and
noticeably the optimal scheme significantly outperforms the
other schemes as $r$ grows. As a numerical illustration of this
fact, we can imagine the scenario where the $N = 200$ files
are videos of size 100 Mbits. When $r = 60$ meters, the rate difference
between $R^{\mathrm{MDS}}_{(\mathrm{opt})}$ and the second best scheme is 0.07.
This represents a difference in backhaul load of 7 Mbits/s if
there is 1 demand per second in the macro cell, which is a
considerable gain for current maximal backhaul capacities.
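The arithmetic of this illustration can be written out as a short sketch (the function name and argument names are hypothetical): a gap in backhaul rate is a fraction of a file per request, so multiplying it by the file size and the request rate gives the difference in backhaul load.

```python
def backhaul_load_gap(rate_gap, file_size_mbit, requests_per_second):
    """Difference in backhaul load (Mbit/s) implied by a gap in backhaul rate."""
    return rate_gap * file_size_mbit * requests_per_second


# Rate gap of 0.07, files of 100 Mbits, 1 demand per second in the macro cell.
gap_mbit_per_s = backhaul_load_gap(0.07, 100, 1)
```

The load difference scales linearly with the request rate, so a macro cell handling ten demands per second would see ten times this gap.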

Finally, in Fig. 6 we illustrate the backhaul rates as a
function of the library size $N$ while the size of the edge caches
remains constant. For small library sizes, the $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$ scheme
is outperformed by the other schemes, as spreading the
library files over the caches is more efficient than only storing
a small number of entire files. On the other hand, as the library
size increases, the performance of the $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$ scheme becomes
better in comparison to the other schemes. The $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{pop})}$ scheme,
however, is still outperformed, since we
consider the overlapping scenario with coverage radius $r = 60$
meters, which is unfavorable to it. Moreover,
we can notice that initially the $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{unif})}$ scheme is really close to
the optimal one. This is due to the fact that if the library size $N$
is small compared to the cache size $M$, the optimal scheme is
well approximated by the $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{unif})}$ scheme that stores the packets
uniformly across the files. As the library size increases, this
naive scheme gets worse, since it does not take into account
the popularity of the files. On the contrary, the performance
of the $\mathcal{C}^{\mathrm{MDS}}_{(\mathrm{prop})}$ scheme gets better, even if the gap with the optimal
scheme becomes constant. This is due to the fact that this
scheme does not truly exploit the possibility a user has to
be served by more than one SBS. In general, the presence
of a gap between the optimal scheme and sub-optimal ones
highlights the importance of implementing the optimal caching
placement scheme to exploit the overlapping coverage regions
of the HetNet and thus to effectively decrease the backhaul
load of the overall network.

Fig. 6. Backhaul rate as a function of the library size $N$, with $\alpha = 0.7$,
$M = 20$ files and $r = 60$ meters.

V. Conclusions 

We considered the problem of optimal MDS-encoded content
placement at the cache-equipped small-cell base stations
at the wireless edge in order to minimize the overall backhaul
load of the network. We derived the optimal caching
placement strategy based on file splitting combined with MDS
encoding, which we formulated as a convex optimization
problem. We then thoroughly investigated the performance of
this optimal placement for a relevant HetNet scenario by
comparing it to existing caching strategies and by measuring
the influence of the key parameters, such as the capabilities
of the small-cell base stations and the library statistics. Our
numerical observations, which can be easily generalized to
any geometric topology, showed that optimizing the placement
of MDS encoded packets at the wireless edge is of crucial
importance, since it yields a significant decrease of the load
of the network backhaul, and thus achieves a considerable
improvement in terms of delay for content delivery to the end
users.